Auditory motivated front-end for noisy speech using spectro-temporal modulation filtering.

Authors

  • Sriram Ganapathy
  • Mohamed Omar
Abstract

The robustness of the human auditory system to noise is partly due to the peak-preserving capability of the periphery and the cortical filtering of spectro-temporal modulations. In this letter, a robust speech feature extraction scheme is developed that emulates this processing by deriving a spectrographic representation that emphasizes the high-energy regions. This is followed by a modulation filtering step to preserve only the important spectro-temporal modulations. The features derived from this representation provide significant improvements for speech recognition in noise and language identification in radio channel speech. Further, the experimental analysis shows congruence with human psychophysical studies.
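
As a rough illustration of the modulation filtering step mentioned in the abstract, the sketch below keeps only a band of spectro-temporal modulations of a spectrogram by masking its two-dimensional Fourier transform. This is a generic Fourier-domain band-pass and not necessarily the filter used in the letter. The function name `modulation_filter`, its parameters, and the cutoff values are assumptions chosen for illustration, and the peak-emphasis step is not covered.

```python
import numpy as np


def modulation_filter(spec, frame_rate_hz, rate_band=(0.5, 12.0), max_scale=0.25):
    """Keep a band of spectro-temporal modulations of a 2-D spectrogram.

    spec          : array of shape (n_bands, n_frames), e.g. log sub-band energies.
    frame_rate_hz : number of spectrogram frames per second.
    rate_band     : (low, high) temporal modulation limits in Hz (illustrative values).
    max_scale     : highest spectral modulation kept, in cycles per band (illustrative).
    """
    spec = np.asarray(spec, dtype=float)
    n_bands, n_frames = spec.shape

    # Modulation-frequency axes of the 2-D Fourier transform.
    scales = np.abs(np.fft.fftfreq(n_bands))                         # cycles per band
    rates = np.abs(np.fft.fftfreq(n_frames, d=1.0 / frame_rate_hz))  # Hz

    # Binary mask: low spectral scales and a band of temporal rates.
    keep_scale = scales <= max_scale
    keep_rate = (rates >= rate_band[0]) & (rates <= rate_band[1])
    mask = np.outer(keep_scale, keep_rate)

    # The mask is symmetric in frequency, so the inverse transform of a
    # real input is real up to rounding error.
    filtered = np.fft.ifft2(np.fft.fft2(spec) * mask)
    return filtered.real
```

With a frame rate of 100 Hz, for example, a 4 Hz temporal modulation passes through this mask while a 40 Hz modulation is removed. A smoother window in place of the binary mask would reduce ringing, at the cost of a less sharply defined pass band.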

Similar resources

Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain

This article presents a new feature extraction technique based on the temporal tracking of clusters in the spectro-temporal feature space. In the proposed method, auditory cortical outputs were clustered, and the attributes of the speech clusters were extracted as secondary features. However, the shape and position of speech clusters change over time. The clusters were temporally tracked and temporal tra...

Using spectro-temporal features to improve AFE feature extraction for ASR

Previous work has shown that spectro-temporal features reduce WER for automatic speech recognition under noisy conditions. The spectro-temporal framework, however, is not the only way to process features in order to reduce errors due to noise in the signal. The two-stage mel-warped Wiener filtering method used in the “Advanced Front End” (AFE), now a standard front end for robust recognition, i...

Spectro-temporal Gabor features as a front end for automatic speech recognition

A novel type of feature extraction is introduced to be used as a front end for automatic speech recognition (ASR). Two-dimensional Gabor filter functions are applied to a spectro-temporal representation formed by columns of primary feature vectors. The filter shape is motivated by recent findings in neurophysiology and psychoacoustics which revealed sensitivity towards complex spectro-temporal ...

Hooking up spectro-temporal filters with auditory-inspired representations for robust automatic speech recognition

Spectro-temporal filtering has previously been shown to yield features that can help increase the robustness of automatic speech recognition (ASR). We replace the spectro-temporal representation used in previous work with spectrograms that incorporate knowledge about the signal processing of the human auditory system and which are derived from Power-Normalized Cepstral Coefficients (PNC...

Normalization of spectro-temporal Gabor filter bank features for improved robust automatic speech recognition systems

Physiologically motivated feature extraction methods based on 2D-Gabor filters have already been used successfully in robust automatic speech recognition (ASR) systems. Recently it was shown that a Mel Frequency Cepstral Coefficients (MFCC) baseline can be improved with physiologically motivated features extracted by a 2D-Gabor filter bank (GBFB). Besides physiologically inspired approaches to ...

Journal title:
  • The Journal of the Acoustical Society of America

Volume 136, Issue 5

Pages: -

Publication date: 2014